STATEMENT OF FOCUS



The Wisconsin Research and Development Center for Cognitive Learning 
focuses on contributing to a better understanding of cognitive learning by
children and youth and to the improvement of related educational practices. The
strategy for research and development is comprehensive. It includes basic
research to generate new knowledge about the conditions and processes of
learning and about the processes of instruction, and the subsequent
development of research-based instructional materials, many of which are designed
for use by teachers and others for use by students. These materials are tested
and refined in school settings. Throughout these operations behavioral
scientists, curriculum experts, academic scholars, and school people interact,
insuring that the results of Center activities are based soundly on knowledge 
of subject matter and cognitive learning and that they are applied to the improve- 
ment of educational practice. 

This Technical Report is from the Language Concepts and Cognitive Skills 
Related to the Acquisition of Literacy Project in Program 1. General objectives
of the Program are to generate new knowledge about concept learning and cogni- 
tive skills, to synthesize existing knowledge, and to develop educational mate- 
rials suggested by the prior activities. Contributing to these Program objectives, 
this project's basic goal is to determine the processes by which children aged 
four to seven learn to read, examining the development of related cognitive and 
language skills, and to identify the specific reasons why many children fail to 
learn to read. Later studies will be conducted to find experimental techniques 
and tests for optimizing the acquisition of skills needed for learning to read. 
By-products of this research program include methodological innovations in 
testing paradigms and measurement procedures; the present study is an example. 
ABSTRACT



This study examined variation in transcriber disagreement as a function 
of transcriber's linguistic background, the transcription task, and the nature 
of judgment involved. Three linguistics students trained in phonetic tran- 
scription, one a non-native speaker of English, listened to the same tapes 
of Midwestern kindergarteners pronouncing lists of common words.
Transcription task varied with order of listening; the first transcriber listened for
errors of articulation and transcribed them in broad phonetic notation. The
other two transcribers served as checkers of the first transcription. The
first checker independently transcribed errors for the words in which the
first transcriber had found errors [an undifferentiated sample of words found
correct was included in the first checker's set of items]. The second checker
listened to words for which the two transcriptions differed and selected one
of the two transcriptions as correct, or added her own. Five protocols for each
of the six possible combinations of first transcriber, first checker, and second 
checker were selected and examined for disagreement. 

Disagreements between the first transcriber and first checker varied as a
function of task and judgment but not as a function of the individuals' linguis- 
tic backgrounds. The first transcriber adopted the stricter criterion of correct 
pronunciation; the first checker appeared to expect an error in each word 
heard, with a consequently greater disagreement rate for sounds judged cor- 
rect by the first transcriber when they appeared in words judged correct. The 
judgment of whether or not a sound in a word was mispronounced produced, 
at most, only half as many disagreements as the selection of a particular 
transcription for a sound thought to be in error by both transcribers. In the 
latter disagreements, the first checker's transcription was selected as cor- 
rect by the second checker 70% of the time, irrespective of the identity of 
the checkers. On the basis of these findings, it is argued that a correction 
procedure for transcription is necessary for any study of articulation
assessing the nature of the errors.
INTRODUCTION



Analysis of speech, the sine qua non for 
accurate studies of language habits, begins, 
for practical reasons, with the reduction of
a complex acoustical stream to a visual record. 
For analysis of certain acoustical features of 
speech and for limited and not very accurate 
recognition of segmental units, electronic 
devices can be employed. But on the major- 
ity of those occasions when speech is exam- 
ined, the human listener is enlisted for the 
transcription stage. The ability of listeners 
with linguistic training to transcribe speech 
accurately is the subject of the present in- 
vestigation. 

The purpose of this study is to examine 
transcriber ability by determining the degree 
of disagreement between pairs of trained lin- 
guists in the transcription of data gathered in 
a developmental study of articulation. Tran- 
scriber disagreements are also examined for 
systematic variation as a function of tran- 
scriber characteristics and transcription task. 



STUDIES OF RELIABILITY 

Linguists, while aware of the possibilities 
of observer bias in recording speech, have 
generally attempted to optimize transcriber 
accuracy through training, without special
concerns for the theoretical degree of reliability
possible or the actual degree of reliability
obtained. In some studies, variation
among transcribers [though not inconsistency 
in a single transcriber's work] was eliminated 
through the use of a single, well-trained lin- 
guist who did all of the transcribing. Such 
was the procedure adopted in the late 19th
century for a dialect atlas of French, when Edmond
Edmont was sent by bicycle through France
and adjoining areas to interview some 600
informants.¹ Most large dialect studies,
however, have enlisted teams of fieldworkers,
depending upon past experience and training 
to produce agreement among them. The first 
group of fieldworkers for the Linguistic Atlas 
of the United States, mostly linguists by 
trade, undertook 6 weeks of extensive train- 
ing in fieldwork in the summer of 1930 before 
beginning to transcribe the nuances of New 
England speech. Following the completion of 
the New England fieldwork, each of the two 
directors of the project rank-ordered the field- 
workers (eight, including themselves) on a 
number of specific skills, including: 

1. Minuteness in phonetic recording;

2. Freedom from systematization according
to the fieldworker's phonemic system;

3. Freedom from systematization according
to the informant's phonemic system;

4. Avoidance of over-transcription; and

5. Accuracy in recording quantity and stress.

There was considerable variation in the
orderings for several of these categories,
indicating some independence among these skills.

No attempt was made, however, to assess 
overall reliability by comparing transcriptions 
of a common source. 

In psychologically oriented studies where 
speech transcriptions are done, reliability 
checks range from none to extensive independent
studies. Templin (1957), in a study of
various language skills in 480 children, dis- 
pensed completely with reliability or consis- 
tency checks: 



Since the author has had a substantial
amount of training and experience in
the use of phonetics and a good deal
of experience in the testing of speech
of young children, and because of the
difficulties in introducing a third person
into the test situation with preschool
children, no reliability check
was made. Repeated tests in the same
child would not provide several judgments
of the same utterance.

¹ See J. Gilliéron and E. Edmont, Atlas
linguistique de la France, Paris, 1902-10.

Henderson (1935), in a study of articula- 
tion in normal institutionalized children, 
measured consistency, i.e., within-transcriber
reliability, by comparing two transcriptions
which she made of the same phonographic
recording. She found that 98.2% of her own
transcriptions were identical. While this 
technique does measure intra-judge consis- 
tency, it does not evaluate accuracy. Con- 
sistently aberrant broad transcriptions could 
rank higher on this measure than accurate 
narrow transcriptions varying in some minute 
detail of vowel length or degree of voicing. 

In a separate study, Henderson (1937) had 
three judges record five consonants which 
occurred in real words pronounced for them 
by a young child. Two instances of each 
consonant in initial, medial, and final posi- 
tions were heard. When judges' responses 
were limited to "correct" or "incorrect," the 
three judges agreed 80% of the time; when a 
phonetic representation of the sound was re- 
quired, however, three-way agreements 
dropped to 72%. When the judges performed
the same tasks, but with a 5-year-old child
heard from a loud-speaker rather than seen, 
the agreement figures dropped to 69% and 
60%, respectively. Nevertheless, these tasks 
are considerably easier than those the tran- 
scriber faced in the earlier Henderson study, 
which required transcription of whole utter- 
ances, rather than a single consonant from 
each. 

Irwin and Curry (1941) and Irwin and Chen 
(1941) tested agreement between pairs of 
transcribers who were recording crying vocal- 
izations of infants under 10 days of age. 

While the agreements were relatively high 
(85% for vowels and 94% for consonants), the 
task is quite distinct from recording real 
speech. The chief difference is that the 
repertoire of crying sounds is small. Fur- 
thermore, while both sets of authors made 
transcriptions of specific phonemes such as 
/æ/ and /i/, the existence of such entities
in the speech of 1- to 9-day-old children is
doubtful. Phonemes depend for their existence
upon a system of contrasts which function
to separate meanings. It is highly unlikely
that any of the sounds made by a 1- to 9-day-old
child meet this criterion. While crying may
contain vowel-like sounds, it does not contain 
vowels in the same sense that the speech of a 
normal 5-year-old does. The acoustic patterns
are distinct. At best, one can interpret the
phonemes presented in these studies as
indicators of the allophones which most closely
approximate the sounds which the children
emitted.



FACTORS WHICH INFLUENCE TRANSCRIPTION 

The factors which are most important for 
agreement between transcribers appear to be 
hearing, training in phonetics, familiarity with 
the speech to be transcribed, and degree of
detail required. While detection of sound pressure
and frequency differences is important for
hearing any sound, little is known about hear- 
ing abilities specifically related to speech re- 
ception, aside from lateralization, which is 
peculiar to dichotic listening, e.g., Kimura 
(1961).

Training in phonetics, like training in art, 
is a necessary prerequisite for detecting
details in the material. Just as the untrained eye
will be unaware of the nuances of brush stroke, 
perspective, and division of space, so will the 
untrained listener be unaware of the finer shades 
of aspiration, nasalization, and devoicing, and 
how to represent them in writing. Along with 
training in general phonetics must go training 
or familiarization in the language or dialect 
being transcribed. 

A fieldworker will often detect a difference 
between two sounds, yet not know, on first 
listening, the exact nature of the difference. 
Through repeated listening, comparisons to 
similar sounds, and attempts at imitation, he 
will generally uncover the phonetic basis for 
the distinction. From that point on he will be 
attuned for such forms. An untrained
fieldworker can often perform reasonably well when
transcribing a familiar dialect in broad phonetic
or phonemic notation, but will begin to trip
over unfamiliar speech patterns or the need to
mark narrow details.

In testing inter-judge reliability, there- 
fore, it is important to note transcriber, speech 
stimulus, and task characteristics. The three
transcribers compared in the present study were 
all trained in phonetic transcription, but one 
was a non-native speaker of English; all were 
living in the dialect area of the speech tran- 
scribed. The materials to be transcribed were 
tape recordings of young children repeating com- 
mon words; the transcribers could listen to each 
recording as often as they chose. The first
transcriber listened for errors of articulation 
and transcribed in broad phonetic notation 
(International Phonetic Alphabet) the substi- 
tuted or inserted sounds, or noted deletions. 
The second transcriber independently tran- 
scribed errors for the subset of words in 
which the first transcriber had found errors, 
plus an undifferentiated random selection of 
words for which no errors had been
transcribed. The third decided between differing
transcriptions made by the first and second
transcribers. Transcribers could disagree
in judgment at two levels: they could
disagree about whether a sound was in error; 
or agree that the sound was in error but 
disagree over its transcription. The data 
were analyzed for differences in agreement 
as a function of transcriber identity, task, 
and judgment. 



II 

METHOD AND RESULTS



A comparison was made of transcriptions 
of three transcribers who listened to the same 
tape-recorded materials and recorded errors of 
articulation in broad IPA. The transcribers 
were: 

A: A female undergraduate in linguistics,
who had spent most of her life in New Jersey
aside from 4 years of high school in Florida.
She had studied both French and Russian for
5 years, Chinese for 1 year, German for half 
a year, and had had approximately 1 year of 
experience in phonetic transcription [aside 
from course work in linguistics]. 

B: A female graduate student in linguistics,
who was raised in Northern Virginia, but at- 
tended college in Massachusetts. She studied 
German for 7 years including 1 year in Germany, 
French for 5 years, Chinese for 2 years, Span- 
ish, Latin, and Greek for 1 year, and Hindi for 
half a year. She had had approximately 2 years 
of experience in phonetic transcription [aside 
from course work in linguistics]. 

C: A female graduate student in linguistics, 
born in Peking, China, who speaks two dia- 
lects of Chinese — Pekingese and Cantonese. 
She studied British English as a foreign lan- 
guage in high school in Hong Kong, and Ameri- 
can English in college in Taiwan. She then 
spent 2 years in New York City and 2 years in 
Providence, R. I. In addition to English, she 
studied French for 1 year and had had approxi- 
mately 2 years of experience in phonetic tran- 
scription [aside from course work in linguistics]. 

The materials to be transcribed were tapes 
of Midwestern Kindergarten children repeating 
standardized lists of common words. The ar- 
ticulation lists included two random orderings 
of each of two different 48-word lists. Most
were high-frequency words, chosen to test each 
vowel of English in at least two environments, 
single consonants and consonant clusters in 
initial and final position, and three-item con- 
sonant clusters in initial position. (The two 
lists of words are included in the Appendix.) 



Ss' pronunciations were recorded on a Uher 
5000 tape recorder with a Shure lavaliere mi- 
crophone at 3 3/4 ips. Tapes were listened 
to on a Uher 5000 tape recorder. The tran- 
scriber could listen to an item as many times 
as she needed, and had data sheets giving the 
spelling of each stimulus word, but not a 
phonetic representation. Transcribers recorded 
any errors detected in broad IPA; appropriate 
pronunciations of words were not transcribed. 
Transcription symbols were limited to the 
phonemes of English plus glottal stops, bila- 
bial fricatives, and aspiration diacritics. No 
judgments of vowel lengthening or shortening, 
final release, weakly articulated consonants, 
or other narrow phonetic details were required. 

One of the three transcribers listened to the 
S repeat the whole word list, transcribing errors
in broad IPA; the other two served as checkers.
All words in which the first transcriber found
an error were marked for future checking.
Furthermore, 10% of a subject's responses were
randomly selected from words for which no 
error had been transcribed by the first tran- 
scriber and added to the items to be checked; 
these were not differentiated from error words. 
The first checker covered up the original tran- 
scription, listened to the items to be checked, 
and transcribed the errors or indicated that no 
error was present. If the first checker disa- 
greed with the original transcriber, then the 
second checker listened to the item and indi- 
cated which of the two transcriptions she 
agreed with. She could also add a third tran- 
scription if she could not agree with either of 
the first two.
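The assembly of the first checker's item set described above can be sketched in code. This is only an illustrative sketch, not the procedure actually used in the study: the function name `build_check_set`, the `seed` parameter, and the reading of "10% of a subject's responses" as 10% of all responses (capped by the number of words with no transcribed error) are assumptions.

```python
import random


def build_check_set(words, error_words, sample_rate=0.10, seed=0):
    """Return the items the first checker hears, in original list order.

    words       -- the stimulus words in the order the subject said them
    error_words -- words in which the first transcriber recorded an error
    sample_rate -- assumed share of all responses drawn from no-error words
    """
    rng = random.Random(seed)
    no_error = [w for w in words if w not in error_words]
    # Assumption: the 10% is taken of all responses, capped by availability.
    n_sample = min(len(no_error), round(sample_rate * len(words)))
    sampled = set(rng.sample(no_error, n_sample))
    # Error words and sampled words are returned together, unmarked,
    # so the checker cannot tell which kind of item she is hearing.
    return [w for w in words if w in error_words or w in sampled]
```

Keeping the items in list order, rather than grouping error words first, is one way to leave the sampled words undifferentiated from the error words.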

The possible patterns of disagreement arising
in the course of checking are as follows
(T1 stands for the first transcriber; C1, the
first checker; C2, the second checker):

Case 1. C1 agrees with T1. No disagreements.

Case 2. C1 disagrees with T1.

2a. C2 agrees with T1. Two-person
disagreement.

2b. C2 agrees with C1. Two-person
disagreement.

2c. C2 adds a new transcription.
Three-person disagreement.
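The case scheme above amounts to a small decision rule, which might be sketched as follows. The function name `classify_case` and the representation of each transcription as a plain string are assumptions made for illustration; the report does not describe any automated scoring.

```python
def classify_case(t1, c1, c2=None):
    """Classify one checked sound as Case 1, 2a, 2b, or 2c.

    t1, c1 -- transcriptions by the first transcriber and first checker
    c2     -- the second checker's choice; only needed when t1 != c1
    """
    if t1 == c1:
        return "1"   # C1 agrees with T1: no disagreement
    if c2 is None:
        raise ValueError("a C2 judgment is required when C1 disagrees with T1")
    if c2 == t1:
        return "2a"  # C2 sides with T1: two-person disagreement
    if c2 == c1:
        return "2b"  # C2 sides with C1: two-person disagreement
    return "2c"      # C2 adds her own: three-person disagreement
```

For example, `classify_case("w", "r", "r")` falls under Case 2b, since the second checker sides with the first checker.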

Since each of the three transcribers could 
serve as first transcriber, first checker, or 
second checker, five transcription sheets with 
checks were randomly selected, from Kinder- 
garten subjects, for each of the following 
combinations of transcriber, first checker, 
and second checker (A, B, and C refer to the 
transcribers). Thus, 30 transcription records 
comprise the sample for this study. 

Combination    T1    C1    C2

     1          A     B     C
     2          A     C     B
     3          B     A     C
     4          B     C     A
     5          C     A     B
     6          C     B     A
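The six role assignments are simply the orderings of the three transcribers, so the sample size can be checked with a short sketch. The constant names below are illustrative assumptions, not terms from the study.

```python
from itertools import permutations

TRANSCRIBERS = ("A", "B", "C")
PROTOCOLS_PER_COMBINATION = 5  # five transcription sheets per combination

# Each ordering assigns the roles (T1, C1, C2) to the three transcribers.
combinations = list(permutations(TRANSCRIBERS, 3))

for number, (t1, c1, c2) in enumerate(combinations, start=1):
    print(number, t1, c1, c2)

total_protocols = len(combinations) * PROTOCOLS_PER_COMBINATION
print("protocols in sample:", total_protocols)
```

`itertools.permutations` yields the orderings in lexicographic order, which happens to coincide with the numbering of the combinations listed above.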



Table 1

Transcriber Disagreement by Phoneme for
Words in which T1 Found an Error(a)

            No. of Phonemes    No. of Phoneme
T1 - C1     Listened to by     Disagreements     Percentage
            T1 - C1

A - B            252                 32             12.7%
A - C            414                 40              9.6%
B - A            181                 16              8.8%
B - C            208                 24             11.5%
C - A            313                 37             11.8%
C - B            311                 28              9.0%

Total           1679                177             10.5%

(a) Five protocols are represented in each
T1-C1 category.



RESULTS 

A tabulation of two-person transcriber
disagreements by phoneme for the words in which
T1 found an error is given in Table 1. The
count for phonemes listened to by both T1 and
C1 was based on the expected number of
phonemes for correct responses to each word.

In Tables 2-4, these disagreements for
words in which T1 transcribed an error are
broken down as follows: a) T1 and C1
transcribed a different error for the same expected
sound (Table 2); b) C1 found the sound T1
transcribed as an error to be correct (Table 3);
c) C1 found an error in one of the sounds
considered correct by T1 (Table 4).

Disagreements between T1 and C1 could
also arise when C1 found a phoneme error in
the sample of words for which T1 had recorded
no error. In Table 5 are presented the phoneme
disagreements for words indicated to be correct
by T1 and checked by C1. Again, the count
for phonemes listened to by both T1 and C1 was
based on the expected number of phonemes for
correct responses to each word.

The phoneme disagreement rates reported in
Tables 2-5 show little variation as a result of
transcriber pairing but marked variation
between tables. Most disagreements arose over
sounds transcribed as errors by T1; C1
disagreed with T1's transcription 169 out of 499 times,
or 33.8% of the time. The majority (70%) of these
disagreements occurred because C1 transcribed a
different error than T1 for the sound in question
(Table 2). C1 disagreed with T1's judgment
that the sound was in error 51 out of 499
times, or 10.2% (Table 3). In contrast, for
those words in which T1 found an error, C1
recorded an error for the sounds of the word
judged correct by T1 only .6% of the time
(Table 4). The disagreement rate rises when
words which T1 found correct are considered;
here, C1 disagreed with T1's judgment of no
phoneme error 4.6% of the time (Table 5).

Table 2

Transcriber Disagreement on Phonemes
Transcribed as Errors by both T1 and C1(a)

            No. of Phoneme     No. of Phoneme
T1 - C1     Errors Transcribed Disagreements     Percentage
            by T1

A - B             72                 18             25.0%
A - C            131                 25             19.0%
B - A             57                 10             17.5%
B - C             63                 19             30.1%
C - A             91                 25             27.4%
C - B             85                 21             24.7%

Total            499                118             23.6%

(a) Five protocols are represented in each
T1-C1 category.

Table 3

Transcriber Disagreement on Phonemes Found
Incorrect by T1 and Correct by C1(a)

            No. of Phoneme     No. of Phoneme
T1 - C1     Errors Transcribed Disagreements     Percentage
            by T1

A - B             72                 13             18.0%
A - C            131                 12              9.0%
B - A             57                  5              8.0%
B - C             63                  5              8.0%
C - A             91                  9              9.8%
C - B             85                  7              8.0%

Total            499                 51             10.2%

(a) Five protocols are represented in each
T1-C1 category.

Table 4

Transcriber Disagreement on Phonemes
Found Correct by T1 in Words for which
T1 Transcribed an Error(a)

            No. of Phonemes    No. of Phoneme
T1 - C1     Listened to by     Disagreements     Percentage
            T1-C1

A - B            180                  1              0.5%
A - C            283                  3              0.1%
B - A            124                  1              0.8%
B - C            145                  0              0.0%
C - A            222                  3              0.1%
C - B            226                  0              0.0%

Total           1180                  8              0.6%

(a) Five protocols are represented in each
T1-C1 category.

Table 5

Transcriber Disagreement on Phonemes in
Sample of Words Treated as Correct by T1(a)

            No. of Phonemes    No. of Phoneme
T1 - C1     Listened to by     Disagreements     Percentage
            T1-C1

A - B            117                  9              7.6%
A - C             78                  2              2.5%
B - A             76                  2              2.6%
B - C             87                  4              4.5%
C - A             82                  0              0.0%
C - B            119                  9              7.5%

Total            559                 26              4.6%

(a) Five protocols are represented in each
T1-C1 category.
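As a check on the arithmetic, the disagreement rates can be recomputed from the Total rows of Tables 2-5. The sketch below is illustrative only; the dictionary keys and variable names are assumptions, and the recomputed values may differ from the printed percentages in the last digit because of rounding.

```python
# (disagreements, phonemes) taken from the Total rows of Tables 2-5.
TABLE_TOTALS = {
    "table2_different_error": (118, 499),
    "table3_c1_found_correct": (51, 499),
    "table4_c1_found_new_error": (8, 1180),
    "table5_c1_error_in_correct_word": (26, 559),
}


def rate(disagreements, phonemes):
    """Disagreement rate as a percentage of phonemes listened to."""
    return 100.0 * disagreements / phonemes


rates = {label: rate(d, n) for label, (d, n) in TABLE_TOTALS.items()}

# All disagreements over sounds that T1 transcribed as errors
# (Tables 2 and 3 share the same 499 phonemes).
combined_on_t1_errors = rate(118 + 51, 499)

# Share of those disagreements due to a different error transcription.
share_different_error = 118 / (118 + 51)

for label, value in rates.items():
    print(f"{label}: {value:.1f}%")
print(f"combined: {combined_on_t1_errors:.1f}%")
print(f"share with a different error: {share_different_error:.0%}")
```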

The phoneme disagreements between T1
and C1 were resolved by a second checker
(C2), who chose between the two transcriptions
or added one of her own, which was
taken as the authoritative transcription.
Among the 177 phoneme disagreements of
T1 and C1 arising for the words in which
T1 had recorded an error, C2 agreed with
T1 44 times, or 25%, irrespective of the
identity of C2. C2 agreed with C1 124
times, or 70%, again irrespective of the
identity of C2. C2 added a new transcription
only 5% of the time, irrespective of
the identity of T1, C1, and C2. The
proportion of times C2 agreed with each
transcriber serving as T1 or C1 is given in
Table 6. Most of the variation is accounted
for by the task (T1 or C1) of the transcriber,
rather than her identity; a particular C2
always agreed more often with another
transcriber when she served as C1. In the case
of A serving as C2, however, there is
evidence of a tendency to agree with B more
often than with C.
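The shares reported for the second checker's decisions can be reproduced by tallying outcome labels. In this sketch the labels "2a", "2b", and "2c" follow the case scheme given in the method, the function name `resolution_shares` is an assumption, and the count of 9 new transcriptions is not stated in the text but is derived here by subtracting 44 and 124 from 177.

```python
from collections import Counter


def resolution_shares(outcomes):
    """Return the share of each C2 outcome label in a list of outcomes."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# 44 agreements with T1, 124 with C1, and (by subtraction) 9 new transcriptions.
outcomes = ["2a"] * 44 + ["2b"] * 124 + ["2c"] * (177 - 44 - 124)
shares = resolution_shares(outcomes)

for label in ("2a", "2b", "2c"):
    print(label, f"{shares[label]:.0%}")
```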
Table 6

Proportion of Second Checker's Choices of Transcription for
T1-C1 Disagreements in Words in Which T1 Found an Error(a)

Second Checker                  Second Checker
Agreed With:                 A          B          C

A                         (.05)(b)     .43        .46
   as T1 only                --        .30        .28
   as C1 only                --        .70        .88

B                           .68      (.05)(b)     .49
   as T1 only               .48        --         .07
   as C1 only               .85        --         .67

C                           .27        .52      (.05)(b)
   as T1 only               .10        .20        --
   as C1 only               .48        .56        --

T1                          .22        .28        .26
C1                          .73        .67        .69
(C2)                       (.05)      (.05)      (.05)

(a) Five protocols are represented in each T1-C1 category.

(b) Five percent of the time, the second checker selected her own transcription as correct
rather than T1's or C1's.
III

DISCUSSION 



In the present study of transcriber varia- 
tion, transcribers did not see the children 
pronouncing words but were allowed to listen 
to each word as often as they needed. Hen- 
derson's study (1937) indicates that loss of 
visual information reduces transcriber agree- 
ment when only one opportunity to listen to 
the word is given. When more opportunities 
are given, one would expect an increase in 
agreement, but there are no data on this point. 

A relatively broad IPA transcription was re- 
quired; more detailed transcription would pre- 
sumably increase disagreement rate. Under 
these conditions of transcription, the most 
striking finding is the difference in disagree- 
ment rate as a function of the transcriber's 
task and judgment required, rather than the 
background of individual transcribers. For
sounds found correct by the first transcriber,
two-person disagreements ranged from .6%,
when the first transcriber had found an error
elsewhere in the word, to 4.6%, when she
had found no error in the word. For sounds
transcribed as errors by the first transcriber,
two-person disagreements rose to 33.8%:
23.6% in which the first checker provided a
different transcription of the error and 10.2%
in which she thought the sound correct, rather
than in error. These patterns of disagreement
held for any pair of transcribers.

Recall that the first checker's primary func- 
tion was to check the first transcriber's tran- 
scription of errors; that is, she expected to 
find errors in most of the words she listened 
to. The difference in disagreement for sounds 
found correct by the first transcriber (.6% when 
T1 had found an error elsewhere in the word; 
4.6% when not) appears to stem from this ex- 
pectation: the first checker, in effect, lis- 
tened especially hard for errors in words until 
she found one — thus perhaps increasing the 
probability of recording one for non-error 
words but decreasing it for sounds in words 
for which she had already located an error. 



That the first checker considered 10.2% of 
the errors identified by T1 to be correct pro- 
nunciations, rather than errors, indicates that 
the individual adopted a sterner criterion for 
correct pronunciation as first transcriber than 
as first checker, reflecting the attitude that
the primary burden of transcription was on the 
first transcriber. These differences in disagree- 
ment over whether a sound is correct or incor- 
rect, then, appear to arise from the task 
assigned the individual for a given protocol. 
Differences among the three individuals when 
they all have the same task assignment (e.g.,
first checker) are minimal; the different
disagreement rates related to task, however, are
marked even within an individual. 

Variation in disagreement can not only be 
traced to task differences and associated ex- 
pectations, but also to differences in the nature 
of the judgment required. A transcriber must 
determine a) whether a given sound in a word 
is mispronounced and b) if it is, decide exactly 
what the error is and transcribe it. The first 
judgment is considerably easier than the sec- 
ond, according to both Henderson's data (1937) 
and the present study. Maximum disagreement 
over whether a sound was correctly pronounced 
or not was 10.2%, when the first transcriber
judged the pronunciation incorrect. Given that
the transcriber and checker agreed that a sound
was in error, however, they disagreed as to the
exact transcription of the sound 23.6% of the
time.

The high rate of disagreement over the 
exact transcription for an error may simply 
reflect the difficulty in choosing a transcription
for a sound that may not even be an English
allophone, or it may additionally reflect inaccuracies
associated with the task of first transcriber.
If the first checker was listening with
the expectation of hearing an error, then pre- 
sumably more of her attention could be given 
to the exact nature of the mispronunciation and 
less to the correct-incorrect decision required 
of the first transcriber. Our data indicate
that all three transcribers regard their col- 
leagues' transcriptions as more accurate when 
the colleagues have served as first checker 
rather than first transcriber — 70% of the time 
the first checker's transcription is chosen. 

In summary, the data indicate that tran- 
scriber agreement is a function of the task 
assigned and the specificity of judgment re- 
quired but not, in this case, of the individual's 
language background. The first transcriber 
used a stricter criterion for correct pronuncia- 
tion than the checkers and noted errors at the 
cost of noting their exact nature. The data 
suggest that a two-part transcription proce- 
dure would be more reliable for the transcrip- 
tion of articulatory errors, when the material 
can be listened to more than once: a first 
pass through the material in which the
transcriber's sole task is to mark the places in
which he hears mispronunciations, and a sec- 
ond pass through the material to transcribe 
those errors. Such a procedure would allow 
the first transcriber to devote his attention to 
the exact nature of the mispronunciation. 
Whether such a transcription procedure is 
adopted or not, it should be noted that fully 
25% of the first transcriber's error transcrip- 
tions were ultimately changed. The fact that 
each individual was agreed with more often 
as first checker than as first transcriber indi- 
cates that these changes can be regarded as 
corrections. The magnitude of their number 
would seem to make such correction procedures 
mandatory for studies which focus on the nature, 
as opposed to the number, of articulatory errors 
in children's speech. 
Appendix

Articulation Test 67-FA
Order I

Condition Voice

No. Name Age

Tape No. School Grade

Side Date Transcribed

Footage: From To Transcriber

LIST A              ERRORS      ERROR CHECK      NOTES

 1. smoke
 2. frog
 3. string
 4. playing
 5. health
 6. clocks
 7. beige
 8. yawn
 9. cold
10. noise
11. girl
12. quack
13. glass
14. vice
15. tenth
16. salt
17. zoo
18. foot
19. bloom
20. throw
21. dish
22. them
23. scarf
24. grass
25. twine
26. judge
27. length
28. red
29. birth
30. spray
31. church
32. coins
33. thumb
34. wind
35. star
36. tree
37. flowers
38. mouth
39. proud
40. crib
41. sleep
42. drink
43. splashing
44. scratch
45. sheep
46. breathe
47. swimming
48. pull
Articulation Test 67-FB (continued)

LIST B              ERRORS      ERROR CHECK      NOTES

24. voice
25. close
26. scrub
27. reach
28. shirt
29. swinging
30. split
31. just
32. house
33. lens
34. wealth
35. child
36. knob
37. smile
38. bathe
39. queen
40. truck
41. plum
42. strength
43. twins
44. door
45. blouse
46. six
47. drive
48. built